U-shaped networks are widely used in various medical image tasks, such as segmentation, restoration, and reconstruction, but most of them rely on centralized learning and thus ignore privacy issues. To address these privacy concerns, federated learning (FL) and split learning (SL) have attracted increasing attention. However, it is hard for both FL and SL to balance local computational cost, model privacy, and parallel training simultaneously. To achieve this goal, in this paper we propose Robust Split Federated Learning (RoS-FL) for U-shaped medical image networks, a novel hybrid learning paradigm combining FL and SL. Previous works cannot simultaneously preserve the privacy of the input, model parameters, labels, and output. To handle all of them effectively, we design a novel splitting method for U-shaped medical image networks that splits the network into three parts hosted by different parties. Besides, distributed learning methods usually suffer from a drift between local and global models caused by data heterogeneity. Based on this consideration, we propose a dynamic weight correction strategy (DWCS) to stabilize the training process and avoid model drift. Specifically, a weight correction loss is designed to quantify the drift between the models of two adjacent communication rounds; minimizing this loss yields a correction model. We then take the weighted sum of the correction model and the final-round model as the result. The effectiveness of the proposed RoS-FL is supported by extensive experimental results on different tasks. Related codes will be released at https://github.com/Zi-YuanYang/RoS-FL.
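To make the weight-correction idea concrete, the following PyTorch sketch illustrates one plausible form of DWCS; the quadratic drift loss, the optimization settings, and the mixing weight `beta` are assumptions for exposition rather than the authors' released implementation.

```python
import copy
import torch

def weight_correction_loss(correction_model, model_prev, model_curr):
    """Illustrative drift loss (an assumption, not the paper's exact form):
    penalise the distance of the correction model to the global models of
    two adjacent communication rounds."""
    loss = 0.0
    for p_c, p_prev, p_curr in zip(correction_model.parameters(),
                                   model_prev.parameters(),
                                   model_curr.parameters()):
        loss = loss + ((p_c - p_curr.detach()) ** 2).sum() \
                    + ((p_c - p_prev.detach()) ** 2).sum()
    return loss

def corrected_global_model(model_prev, model_curr, steps=50, lr=1e-2, beta=0.5):
    """Minimise the drift loss to obtain a correction model, then blend it
    with the final-round model; `beta` is an assumed mixing weight."""
    correction_model = copy.deepcopy(model_curr)
    opt = torch.optim.SGD(correction_model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        weight_correction_loss(correction_model, model_prev, model_curr).backward()
        opt.step()
    with torch.no_grad():
        for p_c, p_curr in zip(correction_model.parameters(),
                               model_curr.parameters()):
            p_c.copy_(beta * p_c + (1.0 - beta) * p_curr)  # weighted sum
    return correction_model
```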
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
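Since the models are publicly released, a few-shot prompt can be run with the Hugging Face transformers library. The sketch below is illustrative: it uses the small public variant `bigscience/bloom-560m` so the example fits in modest memory (the 176B model is loaded the same way from `bigscience/bloom`), and the prompt and generation settings are arbitrary.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Small BLOOM variant chosen so the sketch runs on a single GPU/CPU.
model_name = "bigscience/bloom-560m"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# A tiny few-shot translation prompt (illustrative only).
prompt = (
    "Translate English to French.\n"
    "sea otter => loutre de mer\n"
    "cheese =>"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```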
Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs, which have many computational and memory constraints. In this Mobile AI challenge, we address this problem and ask the participants to design an efficient quantized image super-resolution solution that can demonstrate real-time performance on mobile NPUs. The participants were provided with the DIV2K dataset and trained INT8 models to perform high-quality 3X image upscaling. The runtime of all models was evaluated on the Synaptics VS680 Smart Home board with a dedicated edge NPU capable of accelerating quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating rates of up to 60 FPS when reconstructing Full HD resolution images. A detailed description of all models developed in the challenge is provided in this paper.
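A representative workflow for such an entry is to train a small convolutional super-resolution network and then convert it to a full-integer TFLite model. The sketch below is a hedged illustration of that general pipeline, not any particular submission: the tiny `depth_to_space`-based 3X model, the 360x640 input size, and the random calibration data are placeholders.

```python
import numpy as np
import tensorflow as tf

def build_tiny_sr_model(scale=3):
    """Placeholder 3X super-resolution network: a few convolutions followed
    by depth_to_space (pixel shuffle) upsampling, as is common on mobile NPUs."""
    inp = tf.keras.Input(shape=(360, 640, 3))  # 3X output is Full HD (1080x1920)
    x = tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu")(inp)
    x = tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu")(x)
    x = tf.keras.layers.Conv2D(3 * scale * scale, 3, padding="same")(x)
    out = tf.keras.layers.Lambda(lambda t: tf.nn.depth_to_space(t, scale))(x)
    return tf.keras.Model(inp, out)

model = build_tiny_sr_model()
# ... train the model on DIV2K low/high-resolution pairs ...

def representative_dataset():
    # Calibration data for INT8 quantization; random patches stand in
    # for real DIV2K crops in this sketch.
    for _ in range(100):
        yield [np.random.rand(1, 360, 640, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
with open("sr_int8.tflite", "wb") as f:
    f.write(converter.convert())
```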
Generative Adversarial Networks (GANs) typically suffer from overfitting when limited training data is available. To facilitate GAN training, current methods propose to use data-specific augmentation techniques. Despite their effectiveness, these methods are difficult to scale to practical applications. In this work, we present ScoreMix, a novel and scalable data augmentation approach for various image synthesis tasks. We first produce augmented samples using convex combinations of the real samples. Then, we optimize the augmented samples by minimizing the norms of the data scores, i.e., the gradients of the log-density functions. This procedure pushes the augmented samples toward the data manifold. To estimate the scores, we train a deep estimation network with multi-scale score matching. For different image synthesis tasks, we train the score estimation network using different data. We do not require hyperparameter tuning or modifications to the network architecture. The ScoreMix method effectively increases the diversity of data and reduces the overfitting problem. Moreover, it can be easily incorporated into existing GAN models with minor modifications. Experimental results on numerous tasks demonstrate that GAN models equipped with the ScoreMix method achieve significant improvements.
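The two steps of ScoreMix can be summarized in a short PyTorch sketch. It assumes a pretrained score estimation network `score_net(x)` that approximates the gradient of the log-density; the step size and iteration count are illustrative choices, not the paper's settings.

```python
import torch

def scoremix_augment(x_real, score_net, n_steps=10, step_size=0.01):
    """Simplified ScoreMix-style augmentation sketch.

    1) Build augmented samples as convex combinations of real samples.
    2) Refine them by minimizing the norm of the estimated data score,
       which pushes them toward the data manifold.
    `score_net` is assumed to approximate grad_x log p(x)."""
    # Step 1: convex combination of the batch with a random permutation of it.
    lam = torch.rand(x_real.size(0), 1, 1, 1, device=x_real.device)
    perm = torch.randperm(x_real.size(0), device=x_real.device)
    x_aug = lam * x_real + (1.0 - lam) * x_real[perm]

    # Step 2: gradient descent on ||score(x)||^2 w.r.t. the augmented samples.
    x_aug = x_aug.detach().requires_grad_(True)
    opt = torch.optim.SGD([x_aug], lr=step_size)
    for _ in range(n_steps):
        opt.zero_grad()
        score = score_net(x_aug)                        # estimated data score
        loss = (score ** 2).flatten(1).sum(dim=1).mean()
        loss.backward()
        opt.step()
    return x_aug.detach()
```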
Background and Objective: Existing deep learning platforms for medical image segmentation mainly focus on fully supervised segmentation, which assumes that adequate and accurate pixel-level annotations are available. We aim to develop a new deep learning toolkit to support annotation-efficient learning for medical image segmentation, which can accelerate and simplify the development of deep learning models with limited annotation budgets, e.g., learning from partial, sparse, or noisy annotations. Methods: Our proposed toolkit, named PyMIC, is a modular deep learning platform for medical image segmentation tasks. In addition to basic components that support the development of high-performance models for fully supervised segmentation, it contains several advanced components tailored to learning from imperfect annotations, such as loading annotated and unannotated images, loss functions for images with unannotated, partial, or inaccurate annotations, and training procedures for co-learning between multiple networks. PyMIC is built on the PyTorch framework and supports semi-supervised, weakly supervised, and noisy-label learning methods for medical image segmentation. Results: We present four illustrative medical image segmentation tasks based on PyMIC: (1) achieving competitive performance in fully supervised learning; (2) semi-supervised cardiac structure segmentation with only 10% of the training images annotated; (3) weakly supervised segmentation using scribble annotations; and (4) learning from noisy labels for chest radiograph segmentation. Conclusion: The PyMIC toolkit is easy to use and facilitates the efficient development of medical image segmentation models with imperfect annotations. It is modular and flexible, enabling researchers to develop high-performance models at low annotation cost. The source code is available at: https://github.com/hilab-git/pymic.
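As one example of a loss for imperfect annotations, the sketch below computes a Dice loss only over annotated pixels. It illustrates the general idea and is not PyMIC's actual API.

```python
import torch

def partial_dice_loss(prob, label, annotated_mask, eps=1e-5):
    """Illustrative Dice loss restricted to annotated pixels, in the spirit of
    partial-annotation losses (not PyMIC's actual implementation).

    prob:            (N, C, H, W) softmax probabilities
    label:           (N, C, H, W) one-hot labels (arbitrary where unannotated)
    annotated_mask:  (N, 1, H, W) 1 where a pixel has an annotation, else 0
    """
    prob = prob * annotated_mask
    label = label * annotated_mask
    dims = (0, 2, 3)
    intersection = torch.sum(prob * label, dim=dims)
    denominator = torch.sum(prob, dim=dims) + torch.sum(label, dim=dims)
    dice = (2.0 * intersection + eps) / (denominator + eps)
    return 1.0 - dice.mean()
```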
Architectural advances in deep neural networks have led to remarkable leaps across a range of computer vision tasks. Rather than relying on human expertise, neural architecture search (NAS) has emerged as a promising avenue toward automating architecture design. While recent achievements in image classification suggest opportunities, the promise of NAS has yet to be thoroughly evaluated on the more challenging task of semantic segmentation. The main challenges of applying NAS to semantic segmentation arise from two aspects: (i) the high-resolution images to be processed; (ii) the additional requirement of real-time inference speed (i.e., real-time semantic segmentation) for applications such as autonomous driving. To address such challenges, in this paper we propose a surrogate-assisted multi-objective method. Through a series of customized prediction models, our method effectively transforms the original NAS task into an ordinary multi-objective optimization problem. Followed by a hierarchical pre-screening criterion for in-fill selection, our method progressively achieves a set of efficient architectures trading off between segmentation accuracy and inference speed. Empirical evaluations on three benchmark datasets, together with an application using the Huawei Atlas 200 DK, suggest that our method can identify architectures significantly outperforming existing state-of-the-art architectures designed both manually by human experts and automatically by other NAS methods.
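The core selection step can be sketched as follows: cheap surrogate predictors score every candidate architecture, and a non-dominated (Pareto) filter keeps those trading off accuracy against latency. This is a simplified illustration; the paper's customized predictors and hierarchical pre-screening criterion are not reproduced here.

```python
import numpy as np

def nondominated(points):
    """Return indices of Pareto-optimal points for (error, latency), both minimized."""
    idx = []
    for i, p in enumerate(points):
        dominated = any(
            (q[0] <= p[0] and q[1] <= p[1]) and (q[0] < p[0] or q[1] < p[1])
            for j, q in enumerate(points) if j != i
        )
        if not dominated:
            idx.append(i)
    return idx

def select_candidates(archs, acc_surrogate, latency_surrogate):
    """Surrogate-assisted selection sketch: score every candidate architecture
    with cheap predictors, then keep the accuracy/latency Pareto front.
    `acc_surrogate` and `latency_surrogate` are assumed pretrained regressors
    mapping an architecture encoding to predicted accuracy and latency."""
    errors = [1.0 - acc_surrogate(a) for a in archs]
    latencies = [latency_surrogate(a) for a in archs]
    points = np.stack([errors, latencies], axis=1)
    return [archs[i] for i in nondominated(points)]
```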
Semi-supervised learning (SSL) improves model generalization by leveraging abundant unlabeled data to augment limited labeled samples. However, popular SSL evaluation protocols are currently often constrained to computer vision (CV) tasks. In addition, previous work typically trains deep neural networks from scratch, which is time-consuming and environmentally unfriendly. To address the above issues, we construct a Unified SSL Benchmark (USB) by selecting 15 diverse, challenging, and comprehensive tasks from CV, natural language processing (NLP), and audio processing (Audio), on which we systematically evaluate dominant SSL methods, and also open-source a modular and extensible codebase for fair evaluation of these SSL methods. We further provide pre-trained versions of state-of-the-art neural models for CV tasks to make the cost of further tuning affordable. USB enables the evaluation of a single SSL algorithm on more tasks from multiple domains at a lower cost. Specifically, on a single NVIDIA V100, only 37 GPU days are required to evaluate FixMatch on the 15 tasks in USB, whereas 335 GPU days (279 GPU days on the 4 CV datasets other than ImageNet) are needed on 5 CV tasks with the typical protocol.
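As a reference point for the kind of method USB evaluates, the sketch below shows the unlabeled-data loss of FixMatch in its usual formulation; the confidence threshold and other details are illustrative.

```python
import torch
import torch.nn.functional as F

def fixmatch_unlabeled_loss(model, x_weak, x_strong, threshold=0.95):
    """Minimal sketch of the FixMatch consistency loss. A weakly augmented view
    produces a pseudo-label; the model is trained to predict it on the strongly
    augmented view, but only for samples whose confidence exceeds the threshold."""
    with torch.no_grad():
        probs = torch.softmax(model(x_weak), dim=1)
        conf, pseudo = probs.max(dim=1)
        mask = (conf >= threshold).float()
    logits_strong = model(x_strong)
    per_sample = F.cross_entropy(logits_strong, pseudo, reduction="none")
    return (per_sample * mask).mean()
```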
Deep metric learning (DML) serves to learn an embedding function that projects semantically similar data into nearby regions of the embedding space, and plays a vital role in many applications such as image retrieval and face recognition. However, the performance of DML methods often depends heavily on the sampling method that selects effective data from the embedding space during training. In practice, the embeddings are produced by a deep model, and the embedding space is often barren in regions lacking training points, resulting in the so-called "missing embedding" issue. This issue can impair sample quality and thus degrade DML performance. In this work, we investigate how to alleviate the "missing embedding" issue to improve sampling quality and achieve effective DML. To this end, we propose a Densely-Anchored Sampling (DAS) scheme that regards the embeddings of data points as "anchors" and exploits the embedding space around the anchors to densely produce embeddings without data points. Specifically, we propose to exploit the embedding space around a single anchor with Discriminative Feature Scaling (DFS) and around multiple anchors with Memorized Transformation Shifting (MTS). In this way, with embeddings both with and without data points, we are able to provide more embeddings to facilitate the sampling process, thereby boosting the performance of DML. Our method is effortlessly integrated into existing DML frameworks and improves them without bells and whistles. Extensive experiments on three benchmark datasets demonstrate the superiority of our method.
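The following sketch conveys the spirit of densely generating data-free embeddings around anchors; it is a loose illustration, not the paper's exact DFS and MTS operations, and the shift memory and scale factor are assumptions.

```python
import torch

def densely_anchored_embeddings(anchors, labels, shift_memory,
                                n_per_anchor=2, scale=0.1):
    """Illustrative sketch of producing extra 'data-free' embeddings around
    anchors, in the spirit of DAS (not the paper's exact DFS/MTS operations).

    anchors:      (N, D) embeddings of real data points
    labels:       (N,)   class labels of the anchors
    shift_memory: (M, D) memorized intra-class transformation vectors
    """
    generated, gen_labels = [], []
    for i in range(anchors.size(0)):
        for _ in range(n_per_anchor):
            # Perturb the anchor with a randomly drawn memorized shift,
            # scaled down so the result stays near the data manifold.
            shift = shift_memory[torch.randint(0, shift_memory.size(0), (1,))]
            generated.append(anchors[i] + scale * shift.squeeze(0))
            gen_labels.append(labels[i])
    generated = torch.stack(generated, dim=0)
    gen_labels = torch.stack(gen_labels, dim=0)
    # The enlarged embedding set (real + generated) is what the sampler sees.
    return (torch.cat([anchors, generated], dim=0),
            torch.cat([labels, gen_labels], dim=0))
```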
Recently, several space-time memory based methods have verified that storing intermediate frames and their masks as memory helps segment target objects in videos. However, they mainly focus on better matching between the current frame and the memory frames, without explicitly paying attention to memory quality. As a result, frames with poor segmentation masks are prone to be memorized, which leads to a mask error accumulation problem and further degrades segmentation performance. In addition, the linear growth of memory frames with the number of frames also limits the ability of the model to handle long videos. To this end, we propose a Quality-aware Dynamic Memory Network (QDMN) to evaluate the segmentation quality of each frame, allowing the memory bank to selectively store accurately segmented frames and prevent the error accumulation problem. We then combine segmentation quality with temporal consistency to dynamically update the memory bank and improve the practicality of the model. Without any bells and whistles, our QDMN achieves new state-of-the-art performance on both the DAVIS and YouTube-VOS benchmarks. Moreover, extensive experiments demonstrate that the proposed Quality Assessment Module (QAM) can be applied to memory-based methods as a generic plugin and significantly improves performance. Our source code is available at https://github.com/workforai/QDMN.
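A minimal sketch of a quality-aware memory update is given below; the quality threshold, temporal gap, and memory capacity are illustrative values, not the paper's.

```python
def update_memory(memory, frame_feat, mask, quality_score,
                  frame_idx, last_stored_idx,
                  q_thresh=0.8, min_gap=5, max_size=20):
    """Sketch of a quality-aware memory update in the spirit of QDMN (the
    threshold, gap, and capacity values are illustrative).

    A frame is stored only if its predicted segmentation quality is high
    enough and it is temporally far enough from the last stored frame;
    the oldest non-first entry is evicted when the bank is full."""
    good_quality = quality_score >= q_thresh
    temporally_spaced = (frame_idx - last_stored_idx) >= min_gap
    if good_quality and temporally_spaced:
        memory.append((frame_feat, mask))
        last_stored_idx = frame_idx
        if len(memory) > max_size:
            # Keep the first (ground-truth) frame, drop the oldest estimate.
            memory.pop(1)
    return memory, last_stored_idx
```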
With the increasing need for multi-robot exploration of unknown areas in challenging environments, efficient collaborative exploration strategies are required to achieve such feats. Frontier-based Rapidly-exploring Random Tree (RRT) exploration can be deployed to explore unknown environments. However, its greedy behavior causes multiple robots to explore the regions with the highest revenue, resulting in massive overlap during exploration. To address this issue, we propose a Temporal Memory-based RRT (TM-RRT) exploration strategy for multi-robot teams to perform robust exploration in unknown environments. It computes an adaptive duration for each assigned frontier based on the relative position of each robot and computes the revenue of the frontiers. In addition, each robot maintains a memory of assigned frontiers that is shared across the fleet to prevent repeated assignment of the same frontier. Through both simulation and actual deployment, we demonstrate the robustness of the TM-RRT exploration strategy by completing the exploration of a 25.0 m x 54.0 m (1350.0 m2) area, where the conventional RRT exploration strategy falls short.
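The assignment logic can be sketched as follows; the revenue model, adaptive-duration formula, and speed constant are illustrative assumptions rather than the paper's exact definitions.

```python
import math
import time

def assign_frontier(robot_id, robot_pose, frontiers, shared_memory,
                    base_duration=10.0, speed=0.5):
    """Sketch of TM-RRT-style frontier assignment. Frontiers already claimed in
    the fleet-wide shared memory are skipped, and the chosen frontier receives
    an adaptive duration that grows with the robot-to-frontier distance."""
    best, best_revenue, best_dist = None, -math.inf, 0.0
    for f in frontiers:
        if f["id"] in shared_memory:            # already assigned to some robot
            continue
        dist = math.hypot(f["x"] - robot_pose[0], f["y"] - robot_pose[1])
        revenue = f["information_gain"] - dist  # simple gain-minus-cost revenue
        if revenue > best_revenue:
            best, best_revenue, best_dist = f, revenue, dist
    if best is None:
        return None
    duration = base_duration + best_dist / speed  # adaptive assignment duration
    shared_memory[best["id"]] = {"robot": robot_id,
                                 "expires": time.time() + duration}
    return best
```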